Memorization in Deep Neural Networks: Does the Loss Function Matter?
Authors
Abstract
Deep Neural Networks, often owing to overparameterization, have been shown to be capable of exactly memorizing even randomly labelled data. Empirical studies have also shown that none of the standard regularization techniques mitigate such overfitting. We investigate whether the choice of loss function can affect this memorization. We empirically show, on the benchmark data sets MNIST and CIFAR-10, that a symmetric loss function, as opposed to either cross entropy or squared error, results in a significant improvement in the ability of the network to resist such memorization. We then provide a formal definition of robustness to memorization and a theoretical explanation of why symmetric losses provide this robustness. Our results clearly bring out the role that loss functions alone can play in this phenomenon of memorization.
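The abstract contrasts cross entropy and squared error with a symmetric loss on randomly labelled data. The sketch below is a minimal illustration of that comparison, not the paper's exact setup: it assumes a symmetric loss of the mean-absolute-error form (softmax probabilities versus one-hot labels, one standard example of a symmetric loss), and the model, data, and hyperparameters are placeholders standing in for MNIST/CIFAR-10 training.

```python
# Minimal sketch (not the paper's exact setup): train a small network on
# randomly labelled inputs and compare cross entropy with a symmetric loss.
# The symmetric loss here is mean absolute error between the softmax output
# and the one-hot label; model, data, and hyperparameters are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

def symmetric_mae_loss(logits: torch.Tensor, targets: torch.Tensor,
                       num_classes: int = 10) -> torch.Tensor:
    """Mean absolute error between softmax probabilities and one-hot targets."""
    probs = F.softmax(logits, dim=1)
    one_hot = F.one_hot(targets, num_classes).float()
    return (probs - one_hot).abs().sum(dim=1).mean()

# Stand-in data with the labels replaced by random ones, so the only way
# for the network to fit them is to memorize them.
x = torch.randn(512, 784)
y = torch.randint(0, 10, (512,))

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(500):
    opt.zero_grad()
    logits = model(x)
    # Swap in F.cross_entropy(logits, y) to observe memorization of the
    # random labels; with the symmetric loss the fit should stay much weaker.
    loss = symmetric_mae_loss(logits, y)
    loss.backward()
    opt.step()

with torch.no_grad():
    acc = (model(x).argmax(dim=1) == y).float().mean().item()
print(f"training accuracy on randomly labelled data: {acc:.2f}")
```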
Similar resources
Detecting Learning vs Memorization in Deep Neural Networks using Shared Structure Validation Sets
Abstract: The roles played by learning and memorization represent an important topic in deep learning research. Recent work on this subject has shown that the optimization behavior of DNNs trained on shuffled labels is qualitatively different from DNNs trained with real labels. Here, we propose a novel permutation approach that can differentiate memorization from learning in deep neural network...
A Closer Look at Memorization in Deep Networks
We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness. While deep networks are capable of memorizing noise data, our results suggest that they tend to prioritize learning simple patterns first. In our experiments, we expose qualitative differences in gradient-based optimization of deep neural networks (DNNs) on noise vs...
Why Deep Neural Networks for Function Approximation?
Recently there has been much interest in understanding why deep neural networks are preferred to shallow networks. We show that, for a large class of piecewise smooth functions, the number of neurons needed by a shallow network to approximate a function is exponentially larger than the corresponding number of neurons needed by a deep network for a given degree of function approximation. First, ...
The loss surface and expressivity of deep convolutional neural networks
We analyze the expressiveness and loss surface of practical deep convolutional neural networks (CNNs) with shared weights. We show that such CNNs produce linearly independent features (and thus linearly separable) at every “wide” layer which has more neurons than the number of training samples. This condition holds e.g. for the VGG network. Furthermore, we provide for such wide CNNs necessary a...
The Loss Surface of Deep and Wide Neural Networks
While the optimization problem behind deep neural networks is highly non-convex, it is frequently observed in practice that training deep networks seems possible without getting stuck in suboptimal points. It has been argued that this is the case as all local minima are close to being globally optimal. We show that this is (almost) true, in fact almost all local minima are globally optimal, for...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-75765-6_11